49 research outputs found

    Multimodal Modeling For Spoken Language Identification

    Spoken language identification refers to the task of automatically predicting the spoken language in a given utterance. Conventionally, it is modeled as a speech-based language identification task. Prior techniques have been constrained to a single modality; however, in the case of video data, there is a wealth of other metadata that may be beneficial for this task. In this work, we propose MuSeLI, a Multimodal Spoken Language Identification method, which delves into the use of various metadata sources to enhance language identification. Our study reveals that metadata such as video title, description and geographic location provide substantial information to identify the spoken language of the multimedia recording. We conduct experiments using two diverse public datasets of YouTube videos, and obtain state-of-the-art results on the language identification task. We additionally conduct an ablation study that describes the distinct contribution of each modality for language recognition.

    Simian virus 40 vectors for pulmonary gene therapy

    Background: Sepsis remains the leading cause of death in critically ill patients. One of the primary organs affected by sepsis is the lung, presenting as the Acute Respiratory Distress Syndrome (ARDS). Organ damage in sepsis involves an alteration in gene expression, making gene transfer a potential therapeutic modality. This work examines the feasibility of applying simian virus 40 (SV40) vectors for pulmonary gene therapy. Methods: Sepsis-induced ARDS was established by cecal ligation double puncture (2CLP). SV40 vectors carrying the luciferase reporter gene (SV/luc) were administered intratracheally immediately after sepsis induction. Sham operated (SO) as well as 2CLP rats given intratracheal PBS or adenovirus expressing luciferase served as controls. Luc transduction was evaluated by in vivo light detection, immunoassay and luciferase mRNA detection by RT-PCR in tissue harvested from septic rats. Vector abundance and distribution into alveolar cells was evaluated using immunostaining for the SV40 VP1 capsid protein as well as by double staining for VP1 and for the surfactant protein C (proSP-C). Immunostaining for T-lymphocytes was used to evaluate the cellular immune response induced by the vector. Results: Luc expression measured by in vivo light detection correlated with immunoassay from lung tissue harvested from the same rats. Moreover, our results showed vector presence in type II alveolar cells. The vector did not induce significant cellular immune response. Conclusion: In the present study we have demonstrated efficient uptake and expression of an SV40 vector in the lungs of animals with sepsis-induced ARDS. These vectors appear to be capable of in vivo transduction of alveolar type II cells and may thus become a future therapeutic tool.

    Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

    We introduce the Universal Speech Model (USM), a single large model that performs automatic speech recognition (ASR) across 100+ languages. This is achieved by pre-training the encoder of the model on a large unlabeled multilingual dataset of 12 million (M) hours spanning over 300 languages, and fine-tuning on a smaller labeled dataset. We use multilingual pre-training with random-projection quantization and speech-text modality matching to achieve state-of-the-art performance on downstream multilingual ASR and speech-to-text translation tasks. We also demonstrate that despite using a labeled training set 1/7-th the size of that used for the Whisper model, our model exhibits comparable or better performance on both in-domain and out-of-domain speech recognition tasks across many languages. Comment: 20 pages, 7 figures, 8 tables

    Optimized Spectrophotometry Method for Starch Quantification

    Starch is a polysaccharide that is abundantly found in nature and is generally used as an energy source and energy storage in many biological and environmental processes. Naturally, starch tends to be in minuscule amounts, creating a necessity for quantitative analysis of starch in low-concentration samples. Existing studies that are based on the spectrophotometric detection of starch using the colorful amylose–iodine complex lack a detailed description of the analytical procedure and important parameters. In the present study, this spectrophotometry method was optimized, tested, and applied to studying starch content of atmospheric bioaerosols such as pollen, fungi, bacteria, and algae, whose chemical composition is not well known. Different experimental parameters, including pH, iodine solution concentrations, and starch solution stability, were tested, and method detection limit (MDL) and limit of quantification (LOQ) were determined at 590 nm. It was found that the highest spectrophotometry signal for the same starch concentration occurs at pH 6.0, with an iodine reagent concentration of 0.2%. The MDL was determined to be 0.22 μg/mL, with an LOQ of 0.79 μg/mL. This optimized method was successfully tested on bioaerosols and can be used to determine starch content in low-concentration samples. Starch content in bioaerosols ranged from 0.45 ± 0.05 (in bacteria) to 4.3 ± 0.06 μg/mg (in fungi).

    The volatility of pollen extracts and their main constituents in aerosolized form via the integrated volume method (IVM) and the volatility basis set (VBS)

    The volatility of organic aerosol in the atmosphere is an important quality that determines the aerosol/gas partitioning of compounds in the atmosphere and thus influences their ability to participate in gas-phase reactions in the atmosphere. In this research, the volatility of biological aerosols, specifically water-soluble pollen extracts and their chemical constituents, is studied for important thermodynamic properties such as saturation vapor concentration and latent heat of vaporization. The integrated volume method (IVM) was applied to characterize these properties for various free amino acids and saccharides in pollen, and the volatility basis set (VBS) approach was utilized to obtain a distribution of the mass fraction of pollen extracts with respect to saturation vapor concentration. Our results indicate that among seven compounds tested with the IVM, proline, γ-aminobutyric acid, and fructose had semivolatile saturation vapor concentrations of 17.5 ± 2.2, 14.7 ± 0.8, and 4.4 ± 0.5 μg m⁻³, respectively. Additionally, our VBS measurements indicate that aspen pollen extract contains a greater semivolatile mass fraction (up to 8.5% of total water-soluble mass) than lodgepole pine pollen (up to 2.2%), indicating that different pollen species may contribute to the total atmospheric semivolatile organic compound (SVOC) and low volatile organic compound (LVOC) budget differently. Depending on estimates of several factors, fluxes and concentrations of SVOCs and LVOCs from pollen could be comparable to other sources such as biomass burning and ambient urban emissions, though further research is needed to better constrain the contribution of pollen and other bioaerosols to organic compounds in the atmosphere.

    FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

    We introduce FLEURS, the Few-shot Learning Evaluation of Universal Representations of Speech benchmark. FLEURS is an n-way parallel speech dataset in 102 languages built on top of the machine translation FLoRes-101 benchmark, with approximately 12 hours of speech supervision per language. FLEURS can be used for a variety of speech tasks, including Automatic Speech Recognition (ASR), Speech Language Identification (Speech LangID), Translation and Retrieval. In this paper, we provide baselines for the tasks based on multilingual pre-trained models like mSLAM. The goal of FLEURS is to enable speech technology in more languages and catalyze research in low-resource speech understanding.